NAVAL C3 DISTRIBUTED TACTICAL DECISIONMAKING


1.  PROJECT  OBJECTIVES 


The  objective  of  the  research  is  to  address  analytical  and  computational  issues  that  arise  in  the 
modeling,  analysis  and  design  of  distributed  tactical  decisionmaking.  The  research  plan  has  been 
organized  into  two  highly  interrelated  research  areas: 

(a)  Distributed  Tactical  Decision  Processes; 

(b)  Distributed  Organization  Design. 

The  focus  of  the  first  area  is  the  development  of  methodologies,  models,  theories  and  algorithms 
directed  toward  the  derivation  of  superior  tactical  decision,  coordination,  and  communication 
strategies  of  distributed  agents  in  fixed  organizational  structures.  The  framework  for  this  research 
is  normative. 

The  focus  of  the  second  area  is  the  development  of  a  quantitative  methodology  for  the  evaluation 
and  comparison  of  alternative  organizational  structures  or  architectures.  The  organizations 
considered  consist  of  human  decisionmakers  with  bounded  rationality  who  are  supported  by  C3 
systems.  The  organizations  function  in  a  hostile  environment  where  the  tempo  of  operations  is 
fast;  consequently,  the  organizations  must  be  able  to  respond  to  events  in  a  timely  manner.  The 
framework  for  this  research  is  descriptive. 


2.  STATEMENT  OF  WORK 


The  research  program  has  been  organized  into  seven  technical  tasks  --  four  that  address  primarily 
the  theme  of  distributed  tactical  decision  processes  and  three  that  address  the  design  of  distributed 
organizations.  An  eighth  task  addresses  the  integration  of  the  results.  They  are: 

2.1 Real Time Situation Assessment: Static hypothesis testing, the effect of human constraints
and  the  impact  of  asynchronous  processing  on  situation  assessment  tasks  will  be 
explored. 

2.2  Real  Time  Resource  Allocation:  Specific  research  topics  include  the  use  of  algebraic 
structures  for  distributed  decision  problems,  aggregate  solution  techniques  and 
coordination.

2.3 Impact of Informational Discrepancy: The effect on distributed decisionmaking of
different  tactical  information  being  available  to  different  decisionmakers  will  be  explored. 
The  development  of  an  agent  model,  the  modeling  of  disagreement,  and  the  formulation 
of  coordination  strategies  to  minimize  disagreement  are  specific  research  issues  within  this 
task. 

2.4 Constrained Distributed Problem Solving: The agent model will be extended to reflect
human  decisionmaking  limitations  such  as  specialization,  limited  decision  authority,  and 
limited  local  computational  resources.  Goal  decomposition  models  will  be  introduced  to 
derive  local  agent  optimization  criteria.  This  research  will  be  focused  on  the  formulation 
of  optimization  problems  and  their  solution. 

2.5  Evaluation  of  Alternative  Organizational  Architectures:  This  task  will  address  analytical 
and  computational  issues  that  arise  in  the  construction  of  the  generalized 
performance-workload  locus.  This  locus  is  used  to  describe  the  performance 
characteristics  of  a  decisionmaking  organization  and  the  workload  of  individual 
decisionmakers. 

2.6  Asynchronous  Protocols:  The  use  of  asynchronous  protocols  in  improving  the  timeliness 
of  the  organization's  response  is  the  main  objective  of  this  task.  The  tradeoff  between 
timeliness  and  other  performance  measures  will  be  investigated. 

2.7  Information  Support  Structures:  In  this  task,  the  effect  of  the  C3  system  on  organizational 
performance  and  on  the  decisionmaker's  workload  will  be  studied. 

2.8  Integration  of  Results:  A  final,  eighth  task,  is  included  in  which  the  various  analytical  and 
computational  results  will  be  interpreted  in  the  context  of  organizational  bounded 
rationality. 

3.  STATUS  REPORT 

In the context of the first seven tasks outlined in Section 2, a number of specific research
problems have been formulated and are being addressed by graduate research assistants under the
supervision of project faculty and staff. Research problems which were completed prior to or
were not active during this last quarter have not been included in the report.

3.1 DISTRIBUTED TEAM HYPOTHESIS TESTING WITH EXPENSIVE
COMMUNICATIONS

Background: In Command-Control-and-Communication (C3) systems, multiple hypothesis testing
problems  abound  in  the  surveillance  area.  Targets  must  be  detected  and  their  attributes  must  be 
established;  this  involves  target  discrimination  and  identification.  Some  target  attributes,  such  as 
location,  are  best  observed  by  sensors  such  as  radar.  More  uncertain  target  locations  are  obtained 
by  passive  sensors,  such  as  sonar  or  IR  sensors.  However,  target  identity  information  requires 
other types of sensors (such as ESM receivers, IR signature analysis, human intelligence, etc.). As
a consequence, in order to accurately locate and identify a specific target out of a possibly large
potential population (including false targets), one must design a detection and discrimination
system which involves the fusing of information from several different sensors generating
possibly  specialized  information  about  the  target.  These  sensors  may  be  collocated  on  a  platform 
(say  a  ship  in  a  Naval  battle  group)  or  be  physically  dispersed  as  well  (ESM  receivers  exist  in 
every  ship,  aircraft,  and  submarine).  The  communication  of  information  among  this  diverse 
sensor  family  may  be  difficult  (because  of  EMCON  restrictions)  and  is  vulnerable  to  enemy 
countermeasure  actions  (physical  destruction  and  jamming).  It  is  this  class  of  problems  that 
motivates  our  research  agenda. 

To put it another way, the fusion of information derived from dispersed sensors and decision
nodes  requires  communication.  To  discourage  nonessential  communication  we  would  like  to  put  a 
price  on  each  transmitted  bit.  In  this  manner,  extensive  communications  would  occur  only  if  the 
decision  warrants  them. 

Research  Goals:  We  are  conducting  research  on  distributed  multiple  hypothesis  testing  using 
several  decision-makers,  and  teams  of  decision-makers,  with  distinct  private  information  and 
limited communications. This is the simplest possible non-trivial distributed decision problem,
whose centralized counterpart is well understood and straightforward to compute. The goal of
this  research  is  to  unify  our  previous  research  in  situation  assessment,  distributed  hypothesis 
testing,  and  impact  of  informational  discrepancy;  and  to  extend  the  methodology,  mathematical 
theory  and  computational  algorithms  so  that  we  can  synthesize  and  study  more  complex 
organizational  structures.  The  solution  of  this  class  of  basic  research  problems  will  have  impact  in 
structuring  the  distributed  architectures  necessary  for  the  detection,  discrimination,  identification 
and classification of attributes of several targets (or events) by a collection of distinct sensors (or
dispersed human observers).

The  objective  of  the  distributed  organization  will  be  the  resolution  of  several  possible  hypotheses 
based  on  many  uncertain  measurements.  Each  hypothesis  will  be  characterized  by  several 
attributes.  Each  attribute  will  have  a  different  degree  of  observability  to  different  decision  makers 
or  teams  of  decision  makers;  in  this  manner,  we  shall  model  different  specialization  expertise 
associated  with  the  detection  and  resolution  of  different  phenomena.  Since  each  hypothesis  will 
have  several  attributes,  it  follows  that  in  order  to  reliably  confirm  or  reject  a  particular  hypothesis, 
two  or  more  decision-makers  (or  two  or  more  teams  of  decision-makers)  will  have  to  pool  and 
fuse their knowledge.

Extensive and unnecessary communication among the decision-makers will be discouraged by
explicitly  assigning  costs  to  certain  types  of  communication.  In  this  manner,  we  shall  seek  to 
understand  and  isolate  which  communications  are  truly  vital  in  the  organizational  performance;  the 
very  problem  formulation  will  discourage  communications  whose  impact  upon  performance  is 
minimal.  Quantitative  tradeoffs  will  be  sought. 

We  stress  that  we  shall  strive  to  design  distributed  organizational  architectures  in  which  teams  of 
teams  of  decision-makers  interact.  For  example,  a  team  may  consist  of  a  primary  decision-maker 
together  with  a  consulting  decision-maker  -  the  paradigm  used  by  Papastavrou  and  Athans. 

The  methodology  that  we  plan  to  employ  will  be  mathematical  in  nature.  To  the  extent  possible  we 
shall  formulate  the  problems  as  mathematical  optimization  problems.  Thus,  we  seek  normative 
solution  concepts.  To  the  extent  that  human  bounded  rationality  constraints  are  available,  these 
will  be  incorporated  in  the  mathematical  problem  formulation.  In  this  case,  the  nature  of  the 
results will correspond to what is commonly referred to as normative/descriptive solutions.
Therefore, we visualize a dual benefit of our basic research results. From a purely mathematical
point of view, the research will yield nontrivial advances to the distributed hypothesis-testing
problem, a very difficult problem from a mathematical point of view. From a psychological
perspective, we hope that the normative results will suggest counterintuitive behavioral patterns of
-- even perfectly rational -- decision-makers operating in a distributed tactical decision-making
environment; these will set the stage for designing empirical studies and experiments and point to
key variables that should be observed, recorded and analyzed by cognitive scientists. From a
military viewpoint, the results will be useful in structuring distributed architectures for the
surveillance/discrimination function.

Progress  during  the  past  quarter:  In  the  past  quarter  we  completed  the  investigation  of  the 
problem  of  ternary  hypothesis  testing  by  a  team  of  two  cooperating  decision  makers; 
communication  between  the  two  decision-makers  is  costly  and  consists  of  a  finite  alphabet.  The 
problem  is  to  distinguish  among  three  different  hypotheses.  Each  decision-maker  obtains  an 
uncertain  measurement  of  the  true  hypothesis.  The  so-called  primary  decision-maker  has  the 
option  of  making  the  final  team  decision  or  consulting,  at  a  cost,  the  consulting  decision-maker. 
The  consulting  decision-maker  is  constrained  to  provide  information  using  a  ternary  alphabet.  The 
team  objective  is  to  minimize  the  probability  of  error  together  with  the  communications  cost  (if 
any).  Mr.  Papastavrou,  under  the  supervision  of  Prof.  Athans,  has  derived  all  necessary 
equations.  However,  due  to  the  severe  complexity  of  these  equations,  we  decided  not  to  write  the 
necessary  software  for  their  solution  at  the  present  time. 

Mr.  Papastavrou  and  Professor  Athans  have  initiated  the  investigation  of  a  class  of  distributed 
decision  problems  originally  analyzed  by  L.  Ekchian  in  his  Ph.D.  thesis  (1983).  Consider  the 
problem  of  binary  hypothesis  testing  by  two  decision  makers  (DMs)  connected  in  tandem.  The 
"upstream"  DM  communicates  his  conclusion  to  the  "downstream"  DM  who  then  blends  his 
measurement with the "upstream" decision, and generates the final decision for the team. The
quality  of  each  DM  can  be  quantified  by  his  receiver  operating  characteristic  (ROC)  curve. 
Dominance  of  the  ROC  curves  can  be  used  to  indicate  that  a  particular  DM  is  clearly  better  than  the 
other  one.  Ekchian  had  posed  the  (reasonable)  conjecture  that  the  better  DM  should  be  the 
downstream one. We have been able to verify this conjecture for a class of Gaussian problems,
and we are attempting to either prove the conjecture in general, or construct a counter-example.
This  line  of  inquiry  is  important  because  it  would  point  out  how  relative  expertise  of  DMs  should 
impact  organizational  design. 
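
The tandem configuration can be explored numerically. The following is a minimal sketch, not the analysis carried out in the project: it assumes a toy Gaussian model (unit mean shift, equal priors), represents the "better" DM by a smaller noise standard deviation, and finds the thresholds by brute-force grid search. The function names, the grid, and the numerical values are all hypothetical.

```python
import math
from itertools import product


def q(x):
    """Tail probability P(Z > x) of a standard normal variable Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def tandem_error(sig_up, sig_down, t_up, t_d0, t_d1, prior1=0.5):
    """Team error probability for one choice of thresholds.

    Toy model (an assumption): under H0 each DM observes N(0, sig^2),
    under H1 it observes N(1, sig^2), independently given the hypothesis.
    The upstream DM sends u = 1 when its observation exceeds t_up; the
    downstream DM declares H1 when its observation exceeds t_d0 (if u = 0)
    or t_d1 (if u = 1).
    """
    err = 0.0
    for mean, prior, truth in ((0.0, 1.0 - prior1, 0), (1.0, prior1, 1)):
        p_u1 = q((t_up - mean) / sig_up)  # P(upstream message = 1)
        p_d1 = ((1.0 - p_u1) * q((t_d0 - mean) / sig_down)
                + p_u1 * q((t_d1 - mean) / sig_down))  # P(team declares H1)
        err += prior * (p_d1 if truth == 0 else 1.0 - p_d1)
    return err


def best_tandem_error(sig_up, sig_down, grid):
    """Smallest team error over all threshold triples drawn from grid."""
    return min(tandem_error(sig_up, sig_down, a, b, c)
               for a, b, c in product(grid, repeat=3))


# Thresholds in steps of 0.25, plus +/- infinity so that the downstream DM
# can simply copy the upstream message if that happens to be best.
GRID = [-math.inf] + [i / 4 for i in range(-8, 13)] + [math.inf]

if __name__ == "__main__":
    good, poor = 0.5, 1.5  # smaller sigma gives the dominant ROC curve
    print("better DM downstream:", best_tandem_error(poor, good, GRID))
    print("better DM upstream:  ", best_tandem_error(good, poor, GRID))
```

Comparing the two printed values for different choices of `good` and `poor` gives a quick numerical check of the conjectured ordering within this toy model; it is no substitute for a proof or a counter-example.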

Mr.  Pothiawala  and  Professor  Athans  have  also  examined  the  above  problem  under  the 
assumption that the upstream DM is allowed to communicate his tentative decision to the
downstream DM using more than two bits. We seek to understand the value of each additional bit of
communicated  information  to  the  overall  improvement  of  the  distributed  team  objective  (e.g.  the 
weighted  probability  of  error). 

Documentation:  We  have  started  a  paper  on  the  binary  hypothesis  testing  problem  for  presentation 
at the upcoming JDL C2 Symposium in June 1988. An abstract has been submitted.

3.2 DISTRIBUTED HYPOTHESIS TESTING WITH MANY AGENTS

Background:  The  goal  of  this  research  project  is  to  develop  a  better  understanding  of  the  nature  of 
the  optimal  messages  to  be  transmitted  to  a  central  command  station  (or  fusion  center)  by  a  set  of 
agents  (or  sensors)  who  receive  different  information  on  their  environment.  In  particular,  we  are 
interested  in  solutions  of  this  problem  which  are  tractable  from  the  computational  point  of  view. 
Progress  in  this  direction  has  been  made  by  studying  the  case  of  a  large  number  of  agents. 
Normative/prescriptive solutions are sought.

Problem Statement: Let $H_0$ and $H_1$ be two alternative hypotheses on the state of the environment
and let there be $N$ agents (e.g. intelligent sensors) who possess some stochastic information
related to the state of the environment. In particular, we assume that each agent $i$ observes a
random variable $y_i$ with known conditional distribution $P(y_i \mid H_j)$, $j = 0, 1$, given either
hypothesis. We assume that all agents have information of the same quality, that is, the random
variables are identically distributed. Each agent transmits a binary message to a central fusion
center, based on his information $y_i$. The fusion center then takes into account all messages it has
received to declare hypothesis $H_0$ or $H_1$ true. The problem consists of determining the optimal
strategies of the agents as far as their choice of message is concerned. This problem has been
long recognized as a prototype problem in team decision theory: It is simple enough so that
analysis may be feasible, but also rich enough to allow nontrivial insights into optimal team
decision making under uncertainty.

Results: This problem has been studied by Prof. J. Tsitsiklis. Past results [1-2] can be
summarized as follows: Under the assumption that the random variables $y_i$ are conditionally
independent (given either hypothesis), it is known that each agent should choose his message
based  on  a  likelihood  ratio  test.  Nevertheless,  we  have  constructed  examples  which  show  that 
even  though  there  is  a  perfect  symmetry  in  the  problem,  it  is  optimal  to  have  different  agents  use 
different  thresholds  in  their  likelihood  ratio  tests.  This  is  an  unfortunate  situation,  because  it 
severely  complicates  the  numerical  solution  of  the  problem  (that  is,  the  explicit  computation  of  the 
decision  threshold  of  each  agent).  Still,  we  have  shown  that  in  the  limit,  as  the  number  of  agents 
becomes  large,  it  is  asymptotically  optimal  to  have  each  agent  use  the  same  threshold. 
Furthermore, there is a simple, effective computational procedure for evaluating this single optimal
threshold.

We  have  also  shown  that  if  each  agent  is  to  transmit  K-valued,  as  opposed  to  binary  messages, 
then  still  each  agent  should  use  the  same  decision  rule,  when  the  number  of  agents  is  large. 
Unfortunately,  however,  the  computation  of  this  particular  decision  rule  becomes  increasingly 
harder  as  K  increases. 
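
The identical-threshold rule for binary messages can be illustrated with a short computation. This is a minimal sketch under assumptions that are not in the report: unit-variance Gaussian observations with a unit mean shift, equal priors, and a plain grid search in place of the computational procedure mentioned above. Because the messages are identically distributed given the hypothesis, the fusion center only needs the number of agents that sent a 1.

```python
import math


def q(x):
    """Tail probability P(Z > x) of a standard normal variable Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def fusion_error(n, thr, prior1=0.5):
    """Bayes error at the fusion center when all n agents use threshold thr.

    Toy model (an assumption): agent observations are N(0, 1) under H0 and
    N(1, 1) under H1, conditionally independent; each agent sends 1 when
    its observation exceeds thr. The fusion center sees only the number k
    of ones and picks the hypothesis with the larger posterior weight.
    """
    p0 = q(thr)        # P(message = 1 | H0)
    p1 = q(thr - 1.0)  # P(message = 1 | H1)
    err = 0.0
    for k in range(n + 1):
        w0 = (1.0 - prior1) * math.comb(n, k) * p0 ** k * (1.0 - p0) ** (n - k)
        w1 = prior1 * math.comb(n, k) * p1 ** k * (1.0 - p1) ** (n - k)
        err += min(w0, w1)  # the MAP rule errs with the smaller weight
    return err


def best_common_threshold(n, grid):
    """Grid point that minimizes the fusion error for n agents."""
    return min(grid, key=lambda thr: fusion_error(n, thr))


GRID = [i / 20 for i in range(-20, 41)]  # candidate thresholds from -1.0 to 2.0

if __name__ == "__main__":
    for n in (1, 5, 20, 50):
        thr = best_common_threshold(n, GRID)
        print(n, thr, fusion_error(n, thr))
```

The search is over a single scalar regardless of the number of agents, which is what makes the common-threshold rule attractive computationally.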

We have also investigated the case of M-ary (M > 2) hypothesis testing and constructed examples
showing that it is better to have different agents use different decision rules, even in the limit as
$N \to \infty$. Nevertheless, we have shown that the optimal set of decision rules is not completely
arbitrary. In particular, it is optimal to partition the set of agents into at most $M(M-1)/2$ groups
and, for each group, each agent should use the same decision rule. The decision rule
corresponding to each group and the proportion of the agents assigned to each group may be
determined by solving a linear programming problem, at least in the case where the set of possible
observations by each agent is finite.

Finally,  results  have  been  obtained  which  cover  the  Neyman-Pearson  (as  opposed  to  Bayesian) 
version of the problem, in the case of M=2 hypotheses. The asymptotically optimal solution has
been found and involves the Kullback-Leibler information distance.
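
To make the role of the Kullback-Leibler distance concrete, the sketch below computes the divergence between the distributions of a single binary message under the two hypotheses, as a function of the agent's threshold. The report states only that the solution "involves" this distance; treating the message divergence as the quantity to maximize, the Gaussian observation model, and the grid are assumptions made here for illustration.

```python
import math


def q(x):
    """Tail probability P(Z > x) of a standard normal variable Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def kl_bernoulli(p, r):
    """Kullback-Leibler divergence D(Bernoulli(p) || Bernoulli(r)), 0 < p, r < 1."""
    return p * math.log(p / r) + (1.0 - p) * math.log((1.0 - p) / (1.0 - r))


def message_divergence(thr):
    """Divergence between the message distributions under H0 and H1.

    Toy model (an assumption): observations are N(0, 1) under H0 and
    N(1, 1) under H1, and the agent sends 1 when its observation exceeds thr.
    """
    p0 = q(thr)        # P(message = 1 | H0)
    p1 = q(thr - 1.0)  # P(message = 1 | H1)
    return kl_bernoulli(p0, p1)


GRID = [i / 10 for i in range(-20, 31)]  # candidate thresholds from -2.0 to 3.0
best_thr = max(GRID, key=message_divergence)

if __name__ == "__main__":
    print(best_thr, message_divergence(best_thr))
```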

Currently,  research  is  being  carried  out  by  Prof.  J.  Tsitsiklis  and  a  graduate  student,  Mr.  George 
Polychronopoulos,  and  involves  the  following  two  directions. 

(a) We have considered a class of symmetric detection problems in which, given any hypothesis
$H_i$, each sensor has probability $\epsilon$ of making an observation indicating that some other
hypothesis $H_j$ is true. A simple numerical procedure has been found which completely solves
this problem. Furthermore, a closed form formula for the optimal decision rules has been
found for the case where the "noise intensity" $\epsilon$ is very small.

(b) In the context of the above symmetric problem we have posed problems of the following type:
"Is it preferable to have N sensors each one transmitting D bits, or N/K sensors, each one
transmitting KD bits?" A complete solution has been found. The formulation represents a
fundamental design problem in the design of distributed sensor systems.

We have also conducted research which addresses the issue of the validity of asymptotic
considerations,  when  the  number  of  agents  N  is  moderate  (N=5),  with  encouraging  results. 

The above results will be reported in the Master's thesis of Mr. Polychronopoulos (expected in the
spring of 1988) and in a subsequent journal paper.

Documentation 

[1]  J.  N.  Tsitsiklis,  "On  Threshold  Rules  in  Decentralized  Detection,"  Proc.  25th  IEEE 
Conference  on  Decision  and  Control,  Athens,  Greece,  December  1986;  also  LIDS-P-1570, 
Laboratory for Information and Decision Systems, MIT, Cambridge, MA, June 1986.

[2]  J.  N.  Tsitsiklis,  "Decentralized  Detection  by  a  Large  Number  of  Sensors,"  LIDS-P-1662, 
April  1987;  to  appear  in  Mathematics  of  Control,  Signals  and  Systems,  1988. 

3.3  COMMUNICATION  REQUIREMENTS  OF  DIVISIONALIZED 
ORGANIZATIONS 

Background:  In  typical  organizations,  the  overall  performance  cannot  be  evaluated  simply  in 
terms  of  the  performance  of  each  subdivision,  as  there  may  be  nontrivial  coupling  effects  between 
distinct  subdivisions.  These  couplings  have  to  be  taken  explicitly  into  account;  one  way  of  doing 
so  is  to  assign  to  the  decisionmaker  associated  with  the  operation  of  each  division  a  cost  function 
which  reflects  the  coupling  of  his  own  division  with  the  remaining  divisions.  Still,  there  is  some 
freedom  in  such  a  procedure:  For  any  two  divisions  A  and  B  it  may  be  the  responsibility  of  either 
decisionmaker  A  or  decisionmaker  B  to  ensure  that  the  interaction  does  not  deteriorate  the 
performance  of  the  organization.  Of  course,  the  decisionmaker  in  charge  of  those  interactions 
needs  to  be  informed  about  the  actions  of  the  other  decisionmaker.  This  leads  to  the  following 
problem.  Given  a  divisionalized  organization  and  an  associated  organizational  cost  function, 
assign  cost  functions  to  each  division  of  the  organization  so  that  the  following  two  goals  are  met: 
a)  the  costs  due  to  the  interaction  between  different  divisions  are  fully  accounted  for  by  the 
subcosts  of  each  division;  b)  the  communication  interface  requirements  between  different 
divisions  are  small. 

In  order  to  assess  the  communication  requirements  of  a  particular  assignment  of  costs  to 
divisions, we take the view that the decisionmakers may be modeled as boundedly rational
individuals, and that their decisionmaking process consists of a sequence of adjustments of their
decisions  in  a  direction  of  decreasing  costs,  while  exchanging  their  tentative  decisions  with  other 
decisionmakers  who  have  an  interest  in  those  decisions.  We  then  require  that  there  are  enough 
communications  so  that  this  iterative  process  converges  to  an  organizationally  optimal  set  of 
decisions. 

Problem Statement: Consider an organization with $N$ divisions and an associated cost function
$J(x_1, \ldots, x_N)$, where $x_i$ is the set of decisions taken at the i-th division. Alternatively, $x_i$ may be
viewed as the mode of operation of the i-th division. The objective is to have the organization
operating at a set of decisions $(x_1^*, \ldots, x_N^*)$ which are globally optimal, in the sense that they
minimize the organizational cost $J$. We associate with each division a decisionmaker $\mathrm{DM}_i$, who is
in charge of adjusting the decision variables $x_i$. We model the decisionmakers as "boundedly
rational" individuals; mathematically, this is translated to the assumption that each decisionmaker
will slowly and iteratively adjust his decisions in a direction which reduces the organizational
costs. Furthermore, each decisionmaker does so based only on partial knowledge of the
organizational cost, together with messages received from other decisionmakers.


Consider a partition $J(x_1, \ldots, x_N) = \sum_{i=1}^{N} J^i(x_1, \ldots, x_N)$ of the organizational cost. Each subcost $J^i$
reflects the cost incurred to the i-th division and in principle should depend primarily on $x_i$ and
only on a few of the remaining $x_j$'s. We then postulate that the decisionmakers adjust their
decisions by means of the following process (algorithm):

(a) $\mathrm{DM}_i$ keeps a vector $x$ with his estimates of the current decisions of the other decisionmakers;
also a vector $\lambda$ with estimates of $\lambda_k^i = \partial J^k / \partial x_i$, for $k \neq i$. (Notice that this partial
derivative may be interpreted as $\mathrm{DM}_i$'s perception of how his decisions affect the costs
incurred to the other divisions.)

(b) Once in a while $\mathrm{DM}_i$ updates his decision using the rule $x_i := x_i - \gamma \sum_{k=1}^{N} \lambda_k^i$ ($\gamma$ is a small
positive scalar), which is just the usual gradient algorithm.

(c) Once in a while $\mathrm{DM}_i$ transmits his current decision to other decisionmakers.

(d) Other decisionmakers reply to $\mathrm{DM}_i$ by sending an updated value of the partial derivative
$\partial J^k / \partial x_i$.
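
Steps (a) through (d) can be sketched in a few lines of code. This is an illustrative, synchronous simplification: every step is executed in every round rather than "once in a while", and the two-division quadratic subcosts, the step size, and all names are hypothetical choices, not taken from the report.

```python
def partial(k, i, x):
    """Partial derivative of subcost J^k with respect to x_i (0-based indices).

    Toy subcosts (an assumption):
        J^0(x) = (x0 - 1)^2 + 0.5 * (x0 - x1)^2
        J^1(x) = (x1 + 1)^2
    """
    x0, x1 = x
    if k == 0:
        return 2.0 * (x0 - 1.0) + (x0 - x1) if i == 0 else -(x0 - x1)
    return 0.0 if i == 0 else 2.0 * (x1 + 1.0)


def run(gamma=0.1, rounds=200):
    """Run the iteration and return the final decisions [x0, x1]."""
    n = 2
    x = [0.0] * n                         # x[i]: DM_i's own current decision
    view = [[0.0] * n for _ in range(n)]  # view[k]: DM_k's estimates of all decisions
    lam = [[0.0] * n for _ in range(n)]   # lam[i][k]: DM_i's estimate of dJ^k/dx_i
    for _ in range(rounds):
        for k in range(n):                # step (c): decisions are transmitted
            view[k] = list(x)
        for i in range(n):                # step (d): partial derivatives come back
            for k in range(n):
                lam[i][k] = partial(k, i, view[k])
        for i in range(n):                # step (b): gradient update
            x[i] -= gamma * sum(lam[i])
    return x


if __name__ == "__main__":
    print(run())  # approaches the minimizer of J^0 + J^1, which is (0.5, -0.5)
```

In this toy example J^1 does not depend on x0, so the reply from the second decisionmaker to the first is always zero and could be omitted without changing the iterates.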


It is not hard to see that for the above procedure to work it is not necessary that all DMs
communicate with each other. In particular, if the subcost $J^i$ depends only on $x_i$, for each $i$, there would
be no need for any communication whatsoever. The required communications are in fact
determined by the sparsity structure of the Hessian matrix of the subcost functions $J^i$. Recall now
that all that is given is the original cost function $J$; we therefore have freedom in choosing the $J^i$'s
and we should be able to do this in a way that introduces minimal communication requirements;
that is, we want to minimize the number of pairs of decisionmakers who need to communicate with
each other.
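
The dependence of the communication requirements on the choice of subcosts can be made concrete by counting links. The sketch below assumes a simplified criterion, namely that two decisionmakers need a link whenever the subcost assigned to one of them depends on the decision of the other; the example cost and the function name are hypothetical.

```python
def communicating_pairs(depends_on):
    """Return the unordered pairs of decisionmakers that must exchange messages.

    depends_on[k] is the set of indices i such that subcost J^k depends on
    x_i. Assumption for this sketch: DM_k and DM_i need a link whenever
    J^k depends on x_i with i != k.
    """
    pairs = set()
    for k, variables in depends_on.items():
        for i in variables:
            if i != k:
                pairs.add(frozenset((i, k)))
    return pairs


# Hypothetical cost J = c(x1, x2, x3) + t(x2, x3), with c assigned to division 1.
t_with_dm2 = {1: {1, 2, 3}, 2: {2, 3}, 3: {3}}  # term t assigned to division 2
t_with_dm1 = {1: {1, 2, 3}, 2: {2}, 3: {3}}     # term t assigned to division 1

print(len(communicating_pairs(t_with_dm2)))  # 3 pairs
print(len(communicating_pairs(t_with_dm1)))  # 2 pairs
```

Moving the term t to the decisionmaker who already talks to both other divisions removes one link, a small instance of the pull toward centralization that arises when no limit is placed on any one decisionmaker's load.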

Progress to Date: A graduate student, C. Lee, supervised by Prof. J. Tsitsiklis, undertook the task
of formulating the problem of finding partitions that minimize the number of pairs of DMs who
need to communicate with each other, as the topic of his SM research. It was realized that with a
naive formulation the optimal allocation of responsibilities, imposing minimal communication
requirements, corresponds to the centralization of authority. Thus, in order to obtain more
realistic and meaningful problems, we incorporated a constraint requiring that no agent should
be overloaded. A number of results have been obtained for a class of combinatorial problems,
corresponding to the problem of optimal organizational design under limited communications. In
particular, certain cases were solved; other cases have been successfully reformulated as linear
network flow or assignment problems, for which efficient algorithms are known; and finally,
some cases were shown to be intractable combinatorial problems (NP-complete).

This line of research is now essentially complete. Most results have been reported in the Master's
thesis  of  Mr.  C.  Lee  [1].  A  journal  paper  will  be  prepared  in  the  next  few  months  covering  both 
the  philosophical  and  the  technical  aspects  of  this  work. 

Documentation: 

[1] C. Lee, "Task Allocation for Efficient Performance of a Decentralized Organization,"
LIDS-TH-1706, S.M. Thesis, Laboratory for Information and Decision Systems, MIT,
Cambridge, MA, September 1987.

3.4  COMMUNICATION  COMPLEXITY  IN  DISTRIBUTED  PROBLEM 
SOLVING 


Background:  The  objective  of  this  research  effort  is  to  quantify  the  minimal  amount  of 
information  that  has  to  be  exchanged  in  an  organization,  subject  to  the  requirement  that  a  certain 
goal  is  accomplished,  such  as  the  minimization  of  an  organizational  cost  function.  The  problem 
becomes  interesting  and  relevant  under  the  assumption  that  no  member  of  the  organization 
"knows"  the  entire  function  being  minimized,  but  rather  each  agent  has  knowledge  of  only  a  piece 
of  the  cost  function.  A  normative/prescriptive  solution  is  sought. 


Problem Formulation: Let $f$ and $g$ be convex functions of $n$ variables. Suppose that each one of
two agents (or decisionmakers) knows the function $f$ (respectively $g$), in the sense that he is able
to compute instantly any quantities associated with this function. The two agents are to exchange
a number of binary messages until they are able to determine a point $x$ such that $f(x) + g(x)$
comes within $\epsilon$ of the minimum of $f + g$, where $\epsilon$ is some prespecified accuracy. The objective is
to determine the minimum number of such messages that have to be exchanged, as a function of $\epsilon$,
and to determine communication protocols which use no more messages than the minimum
amount required.

Results:  Several  variations  of  this  problem  have  been  studied  and  solved  by  Professor  J.  Tsitsiklis 
and a graduate student Zhi-Quan Luo. Results have been reported in [1].

An interesting qualitative feature of the communication-optimal algorithms discovered thus far is
the following: It is optimal to transmit aggregate information (the most significant bits of the
gradient of the function optimized) in the beginning; then, as the optimum is approached, more
refined information should be transferred. This very intuitive result seems to correspond to
realistic situations in human decisionmaking.
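
The "coarse first, fine later" pattern can be illustrated with the simplest possible case: transmitting one scalar, such as a single normalized gradient component, most significant bit first. This is only an illustration of successive refinement, not the protocol of [1]; the function names and the restriction to values in [0, 1) are assumptions.

```python
def msb_first_bits(value, n_bits):
    """Binary expansion of a value in [0, 1), most significant bit first."""
    bits = []
    for _ in range(n_bits):
        value *= 2.0
        bit = int(value >= 1.0)
        bits.append(bit)
        value -= bit
    return bits


def decode(bits):
    """Reconstruct the receiver's estimate from the bits received so far."""
    return sum(b * 2.0 ** -(k + 1) for k, b in enumerate(bits))


if __name__ == "__main__":
    g = 0.7  # hypothetical normalized gradient component
    for k in (1, 2, 4, 8):
        estimate = decode(msb_first_bits(g, k))
        print(k, estimate, abs(g - estimate))
```

Each additional bit halves the receiver's uncertainty, so the first few bits carry the aggregate information and later bits only refine it.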

More recently, we have considered a new formulation in which the messages are real-valued,
rather than discrete. A prototype problem is to assume that each one of two agents knows an $n \times n$
matrix $A_i$, $i = 1, 2$. The objective is to compute a particular entry of $(A_1 + A_2)^{-1}$. This problem
arises, for example, in distributed optimization of a cost function of the form $x'A_1x + x'A_2x + x'b$.
An obvious solution is for agent 1 to transmit all of the entries of $A_1$ to agent 2, who then
performs the required computations. This scheme requires $n^2$ communications. We have
succeeded in showing that there exists no method which will do with fewer than $O(n^2)$
communications. That is, information must be centralized. On the technical side, we have
restricted attention to communication protocols which are smooth rational functions of the original data $A_1$,
$A_2$. (Otherwise $n^2$ numbers could have been coded in a single real number.) The proof of our
result uses novel techniques and makes use of certain results in algebraic geometry.
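
The link between the quadratic cost and the matrix inverse can be spelled out in one line. The derivation below assumes, for illustration, that both matrices are symmetric and that their sum is positive definite; neither assumption is stated in the report.

```latex
\begin{aligned}
J(x) &= x^{\top} A_1 x + x^{\top} A_2 x + x^{\top} b, \\
\nabla J(x) &= 2\,(A_1 + A_2)\,x + b = 0, \\
x^{\star} &= -\tfrac{1}{2}\,(A_1 + A_2)^{-1} b .
\end{aligned}
```

Under these assumptions each component of the minimizer is a linear combination of entries of the inverse of the summed matrix, which is why computing such entries serves as the prototype problem.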


Documentation: 

[1] J. N. Tsitsiklis and Z.-Q. Luo, "Communication Complexity of Convex Optimization,"
LIDS-P-1617, Laboratory for Information and Decision Systems, MIT, October 1986; Proc.
25th IEEE Conference on Decision and Control, Athens, Greece, December 1986; also an
invited talk was given at the 2nd Symposium on Complexity of Approximately Solved
Problems, Columbia University, New York, April 1987; also Journal of Complexity, 3,
1987, pp. 231-243.


3.5  DISTRIBUTED  ORGANIZATION  DESIGN 

Background:  The  bounded  rationality  of  human  decisionmakers  and  the  complexities  of  the  tasks 
they  must  perform  mandate  the  formation  of  organizations.  Organizational  architectures  distribute 
the  decisionmaking  workload  among  the  members:  different  architectures  impose  different 
individual  loads  and  result  in  different  organizational  performance.  Two  measures  of 
organizational  performance  are  accuracy  and  timeliness.  The  first  measure  of  performance 
addresses in part the quality of the organization's response. The second measure reflects the fact
that in tactical decisionmaking when a response is generated is also significant: the ability of an
organization to carry out tasks in a timely manner is a determining factor of effectiveness.

The  scope  of  work  was  divided  into  three  tasks: 

(a)  Evaluation  of  Alternative  Organizational  Architectures; 

(b)  Asynchronous  Protocols;  and 

(c)  Information  Support  Structures. 

During  this  year,  the  research  effort  has  been  organized  around  three  foci.  In  the  first  one,  we 
continue to work on the development of analytical and algorithmic tools for the analysis and
design  of  organizations.  In  the  second,  we  are  integrating  the  results  obtained  thus  far  through  the 
development  of  a  workstation  for  the  design  and  analysis  of  alternative  organizational 
architectures.  Finally,  the  experimental  program,  initiated  last  year  with  the  objective  of  collecting 
data  necessary  to  calibrate  the  models  and  evaluate  different  architectures  for  distributed 
decisionmaking,  has  been  continuing  and  is  expanding. 

3.5.1  Design  and  Evaluation  of  Alternative  Organizational  Architectures. 

In  order  to  design  an  organization  that  meets  some  performance  requirements,  we  need  to  be  able 
to  do  the  following: 

(a)  Articulate  the  requirements  in  qualitative  and  quantitative  terms; 

(b)  Generate  candidate  architectures  that  meet  some  of  the  requirements; 

(c) Evaluate the candidate organizations with respect to the remaining requirements; and

(d) Modify the designs so as to improve the effectiveness of the organization.

The  generalized  Performance  Workload  locus  has  been  used  as  the  means  for  expressing  both  the 
requirements  that  the  organization  designer  must  meet  and  the  performance  characteristics  of  any 
specific  design.  Consider  an  organization  with  N  decisionmakers.  Then  the  Performance 
Workload  space  is  an  N+2  dimensional  space  in  which  two  of  the  dimensions  correspond  to  the 
measures  of  the  organization's  performance  (say,  accuracy  and  timeliness)  and  the  remaining  N 
dimensions  correspond  to  the  measure  of  the  workload  of  each  individual  decisionmaker.  Two 
loci  can  be  defined.  First,  the  Requirements  locus  is  the  set  of  points  in  this  N+2  dimensional 
space  that  satisfy  the  performance  and  workload  requirements  associated  with  the  task  to  be 
performed by the organization. Second, the System locus is the set of points that are
achievable  by  a  particular  design.  The  design  problem  can  then  be  conceptualized  as  the  reshaping 
and  repositioning  of  the  System  locus  in  the  Performance  Workload  space  so  that  the 
requirements  are  met. 
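
A minimal sketch of how the two loci might be compared numerically is given below. It rests on
assumptions that are not in the report: the System locus is represented by a finite sample of
points, every coordinate is scored so that smaller is better, the Requirements locus is a set of
upper bounds, and effectiveness is taken to be the fraction of sampled points that satisfy all
bounds. The names `meets_requirements` and `effectiveness` are hypothetical.

```python
def meets_requirements(point, max_values):
    """True if every coordinate of the point is within its upper bound.

    A point has N + 2 coordinates: two performance measures followed by the
    workload of each of the N decisionmakers (assumed layout).
    """
    return all(value <= bound for value, bound in zip(point, max_values))


def effectiveness(system_locus, max_values):
    """Fraction of sampled System locus points lying inside the Requirements locus."""
    if not system_locus:
        return 0.0
    inside = sum(1 for point in system_locus if meets_requirements(point, max_values))
    return inside / len(system_locus)


# Example with N = 2 decisionmakers: (error, delay, workload_1, workload_2).
sample_locus = [(0.1, 2.0, 5, 6), (0.3, 1.0, 4, 4), (0.2, 3.0, 9, 3)]
requirements = (0.25, 2.5, 8, 8)
score = effectiveness(sample_locus, requirements)
```

Under these assumptions, reshaping or repositioning the System locus corresponds to changing the
design so that a larger share of its points falls inside the bounds.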

Three thesis projects were completed during this period. The individual problem statements and a
description of the results follow:

Modeling and Evaluation of Variable Structure Organizations

Problem Statement: Develop a methodology for modeling and analyzing classes of
variable-structure organizations, i.e., organizations where the interactions between
decisionmakers can change. These organizations, named VDMO from now on, can be classified
according to what factors trigger the change. Three types have been defined:

- Type 1 Variability: The VDMO adapts the structure of its interactions to the
input it processes.

- Type 2 Variability: The VDMO adapts the structure of its interactions to
changes in the environment in which it functions.

- Type 3 Variability: The VDMO adapts the structure of its interactions to
changes in its own components. For instance, it can reconfigure itself to
perform its task when its resource availability has changed.

In  both  Type  2  and  3  VDMOs,  the  issue  of  the  detection  by  the  organization  that  a  change  has 
occurred  has  not  been  addressed.  These  three  types  of  variability  can  exist  concurrently  in  a  given 
organization;  however,  for  their  analysis  and  for  the  evaluation  of  their  effects  on  system 
performance,  they  have  been  treated  separately. 

Progress  to  Date:  This  problem  was  addressed  by  Jean-Marc  Monguillet  under  the  supervision  of 
Dr.  A.  H.  Levis.  The  focus  of  the  research  effort  has  been  the  modeling  and  analysis  of  variable 
structure  organizations  using  Predicate  Transition  Nets. 

The  System  Effectiveness  Analysis  methodology  has  been  extended  to  account  for  variable 
structure  organizations.  A  Measure  of  Effectiveness  has  been  proposed  for  each  type  of  variable 
DMO.  A  mathematical  formulation  for  the  computation  of  that  MOE  has  been  established. 

A modeling methodology has been described providing a representation of DMOs by functions.
The main feature of that methodology is the decoupling between the pattern of interactions and
the identity of decisionmakers, who are modeled by tokens and treated like any other resources.
The Predicate Transition Nets formalism has been adapted to allow such a representation.

An  example  illustrating  the  overall  procedure  has  been  developed.  It  consists  of  three  candidate 
designs  for  an  air  defense  task.  Each  of  these  candidates  is  composed  of  three  decisionmakers, 
namely  one  Headquarters  and  two  Field  Units.  Two  organizations  have  a  fixed  structure,  and  the 
third one is Type 1 variable: for some tasks, it adapts the pattern of interactions to a pattern
comparable to that of the first fixed structure DMO; for others, it takes the pattern of the second.
The  results  of  the  comparison  of  these  designs  are  that  a  particular  organizational  design  cannot  be 
selected  in  general  on  the  basis  of  its  performance  characteristics  alone,  as  presented  in  the  form 
of  a  system  locus.  The  Effectiveness  of  each  candidate  has  to  be  evaluated  quantitatively  for  each 
set  of  mission  requirements;  then  zones  can  be  defined  in  the  requirements  space  which 
characterize  for  each  organization  the  ranges  of  mission  requirements  for  which  it  is  the  most 
effective.  In  that  particular  case,  the  set  of  mission  requirements  for  which  the  variable  structure 
organization  has  the  highest  Effectiveness  can  be  computed.  It  has  been  shown  clearly  that  a 
variable  structure  organization  was  preferable  to  the  fixed  structure  ones  when  the  requirements 
were  such  that  one  fixed  design  was  not  timely  enough,  whereas  the  other  was  not  accurate 
enough. Type 1 variability provided a compromise between the extreme performance characteristics
of the organizations with fixed structure.

Documentation:  The  thesis  of  J.-M.  Monguillet  has  been  issued  as  a  LIDS  report. 

[1] Jean-Marc Monguillet, "Modeling and Evaluation of Variable Structure Organizations," S.M.
Thesis, Report LIDS-TH-1730, Laboratory for Information and Decision Systems, MIT,
Cambridge, MA, December 1987.


Design  of  Organizations 

Objective:  Given  a  feasible  organizational  architecture,  develop  a  methodology  for  (a)  identifying 
the  functions  that  must  be  performed  by  the  organization  in  order  that  the  task  be  accomplished, 
(b)  selecting  the  resources  (human,  hardware,  software)  that  are  required  to  implement  these 
functions,  and  (c)  integrating  these  resources  -  through  interactions  -  so  that  the  system  operates 
effectively. 

Progress  to  Date:  This  research  problem  has  been  investigated  by  Stamos  K.  Andreadakis  under 
the  supervision  of  Dr.  A.  H.  Levis.  A  doctoral  thesis  has  been  defended  successfully  in 
December  and  the  dissertation  is  in  the  final  stages  of  preparation.  The  results  of  this  task  will  be 
reported  in  the  next  progress  report. 

Documentation:

[1]  A.  H.  Levis  and  S.K.  Andreadakis,  "Computer-Aided  Analysis  of  Organizations,"  Proc. 
25th  IEEE  Conference  on  Decision  and  Control,  Athens,  Greece,  December  1986. 

[2] S.K. Andreadakis and A. H. Levis, "Accuracy and Timeliness in Decision-Making
Organizations," Proc. 10th IFAC World Congress, July 27-31, 1987, Munich, FRG and
Proc. 9th MIT/ONR Workshop on C3 Systems, LIDS-R-1624, MIT, Cambridge, MA,
December 1986.

[3]  S.  K.  Andreadakis  and  A.  H.  Levis,  "Design  Methodology  for  Decision-Making 
Organizations,"  Proc.  of  1987  Symposium  on  &  Research,  National  Defense  University, 
Washington  DC,  June  1987. 


Performance  Evaluation  of  Organizations  with  Decision  Aids 


Problem  Statement:  Analyze  and  evaluate  the  impact  of  decision  aids,  i.e.,  preprocessors  and 
decision  support  systems,  on  the  effectiveness  of  decisionmaking  and  information  processing 
organizations.  In  particular,  investigate  the  concept  of  coordination  of  decisionmakers  assisted  by 
those  decision  aids. 


Progress  to  Date:  A  Master's  Thesis  has  been  completed  by  Jean-Louis  Grevet  under  the 
supervision  of  Dr.  A.H.  Levis.  From  a  conceptual  standpoint,  the  idea  of  coordination  in 
decision-making  organizations  embodies  three  classes  of  issues: 

- the extent to which the decisionmakers constitute a team;

- the synchronization of the decisionmakers' activities; and

- the consistency of the information processed by the different members
of the organization.

The latter class of issues is primarily related to the fact that decisionmakers do not necessarily
process data that are consistent, because the data have different geographical or temporal origins.
For instance, two different decisionmakers can process data originating from different sensors or
different databases, as well as data originating from a common database but accessed at different
instants.


The work focused on these three classes of issues:

(a) The concept of a team of decisionmakers has been clarified. A team of decisionmakers is
defined as being an organization in which the members:

-  have  a  common  goal 

-  have  the  same  interests  and  same  perception  of  the  environment 

-  have  activities  which  must  be  coordinated  so  that  they  achieve  a  higher  performance. 

Thus, for a task X with probability distribution p(X) and a cost function c(X) for the
organization, one condition for the organization to be a team is that its members have the same
perception of the task, $p_T(X)$, i.e., the same beliefs about the task, and assign the same cost
$c_T(X)$ to each input, i.e., have the same interests as far as the task is concerned.

The team will account perfectly for the organizational objectives when:
$p_T(X) = p(X)$ and $c_T(X) = c(X)$

(b) The issue of synchronization is related to the interactions between the decisionmakers that
take place during the decisionmaking process. It is thus a dynamic characteristic of the
organization. When a decisionmaker $DM_i$ processes some information, the total processing
time of this input for $DM_i$ consists of two distinct parts:

- the time $T_t$ during which the decisionmaker actually processes the information; and

- the time $T_p$ spent by the information in the memory of the decisionmaker
without being processed.

The  time  Tp  is  the  result  of  two  factors: 

- Information can remain in the memory of the decisionmaker until he decides to process
it using the relevant algorithm. In this case, the decisionmaker has several
pieces of information to process at the same time. Since a particular algorithm cannot process two
inputs at the same time, some inputs will have to remain unprocessed in memory,
waiting until the relevant algorithm is free.

- Information can also stay in memory because the decisionmaker waits to receive a
necessary piece of information from another decisionmaker or a decision support
system.

An organization is perfectly synchronized when, for the whole decisionmaking process, the
decisionmakers do not have to wait for information that they need in order to process the
information that is in memory. The synchronization degrades when the processing of some inputs
leads decisionmakers to wait for these data.

Synchronization is an important concept because the processing of information introduces three
kinds of biases:

- biases due to the uncertainty embodied in the information processed;

- biases due to the models used; and

- biases due to the value of information when the decisionmaker actually processes it.

If an item of information remains in memory for a long time, the decisionmaker might well attach
less value to it when he actually processes it. This could lead to a degradation of the
effectiveness of the organization.

(c) The consistency of information refers to whether or not different items of data can be fused
together without contradiction. It is mission dependent. Data can be inconsistent if they have
different geographical or temporal origins. For instance, two different decisionmakers can
process data originating from different sensors or different databases, as well as data
originating from a common database but accessed at different times.

The modeling of decisionmaking processes that require coordination has been completed using the
Predicate Transition Nets formalism. The tokens, which are the symbolic information carriers, are
identified by three attributes:

- the time of entry in the net, $T_n$;

- the time of entry in a specific place, $T_p$; and

- the class $c$ assigned to information items by the previous processing stage.

The rule of enablement of transitions is that the tokens in the input places must have the same
attribute $T_n$. This means that, when decisionmakers interact, they must refer to the same input.
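
The enablement rule can be illustrated with a small sketch. It is one possible reading of the
rule, not the simulation program mentioned in this report: a transition is treated as enabled when
every input place holds at least one token carrying a common net-entry time. The names `Token`,
`enabling_inputs` and `is_enabled`, and the field names, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    t_net: float    # time of entry in the net
    t_place: float  # time of entry in the current place
    cls: str        # class assigned by the previous processing stage


def enabling_inputs(input_places):
    """Return the net-entry times shared by at least one token in every input place."""
    if not input_places:
        return set()
    times_per_place = [{token.t_net for token in place} for place in input_places]
    return set.intersection(*times_per_place)


def is_enabled(input_places):
    """A transition is enabled if some input is represented in all of its input places."""
    return bool(enabling_inputs(input_places))


# Two input places: only the tokens that entered the net at time 2.0 can be fused.
place_a = [Token(0.0, 1.0, "threat"), Token(2.0, 2.5, "neutral")]
place_b = [Token(2.0, 3.0, "threat")]
shared = enabling_inputs([place_a, place_b])
```

In this reading, a token whose counterpart has not yet arrived simply waits in its place, which is
the waiting time that the synchronization discussion refers to.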


Two  measures  that  can  be  used  for  evaluating  the  coordination  in  decisionmaking  organizations 
have  been  defined: 

-  the  degree  of  information  consistency,  D. 

-  the  measure  of  synchronization,  S-p. 

A simulation program for Predicate Transition Nets has been developed using the Design Open
Architecture Development System of Meta Software Corp. It can be used
to gain insight into the dynamics of the decisionmaking process.

The  impact  of  decision-aids  on  the  coordination  of  decisionmaking  organizations  has  been 
assessed  using  the  modeling  and  evaluation  tools  described  above.  A  model  of  a  decisionmaker 
assisted  by  a  decision  support  system  has  been  proposed.  It  accounts  for  the  fact  that  most  real 
systems  contain  both  elements  of  centralization  and  decentralization,  i.e.,  the  users  can  share 
certain  resources  -  centralized  databases  or  mainframes  -  and  access  individually  other  facilities 
such  as  intelligent  terminals.  This  modifies  the  strategy  of  each  decisionmaker  who  must  integrate 
in  his  choices  the  possibility  of  requesting  information  from  the  DSS.  Thus,  each  decisionmaker 
has  three  alternatives  vis-a-vis  the  DSS: 

-  he  can  ignore  it  and  process  the  information  by  himself. 

-  he  can  query  it  and  rely  totally  on  the  response. 

-  he  can  query  it  and  compare  the  response  to  his  own  perception  of  the  issue. 

The  evaluation  of  these  choices  has  been  carried  out  on  an  example,  a  two-person  hierarchical 
organization. 

It has been found that decision aids can modify the coordination of the decisionmaking process by:

- modifying the priority order with which different organization members
process the inputs; and

- increasing the number of information flow paths with different processing times.

Azzola, V. Jin, J. Kyratzoglou and J. Papastavrou attended the Annual Review of the DTDM
program  organized  by  the  Office  of  Naval  Research. 


Dr.  Levis  attended  the  5th  Annual  Workshop  on  Command  and  Control  Decision  Aiding  where  he 

... $I(f_1 + f_2; \epsilon)$. Let $C(f_1, f_2; \epsilon, \pi)$ be the total number of messages that are
exchanged; this is a function of the particular protocol being employed
and we are looking for an optimal one. More precisely, let

$$C(\mathcal{F}; \epsilon, \pi) = \sup_{f_1, f_2 \in \mathcal{F}} C(f_1, f_2; \epsilon, \pi) \qquad (1.1)$$

be the communication requirement (in the worst case) of the particular
protocol and let

$$C(\mathcal{F}; \epsilon) = \inf_{\pi \in \Pi(\epsilon)} C(\mathcal{F}; \epsilon, \pi) \qquad (1.2)$$

be the communication requirement under an optimal protocol, where $\Pi(\epsilon)$
is the class of all protocols which work properly, for a particular choice of
$\epsilon$. The quantity $C(\mathcal{F}; \epsilon)$ may be called the $\epsilon$-communication complexity of
the above-defined problem of distributed, approximate, convex optimization.

For the above definition to be precise, we need to be specific regarding
the notion of a protocol; that is, we have to specify the set $\Pi(\epsilon)$ of admissible
protocols and this is what we do next. A protocol $\pi$ consists of

(a) A termination time $T$;

(b) A collection of functions $M_{i,t}: \mathcal{F} \times \{0,1\}^t \to \{0,1\}$, $i = 1, 2$;
$t = 0, 1, 2, \ldots, T - 1$;

(c) A final function $Q: \mathcal{F} \times \{0,1\}^T \to [0,1]^n$.

A protocol corresponds to the following sequence of events. Each processor
$P_i$ receives its "input" $f_i$ and then, at each time $t$, transmits to the
other processor $P_j$ a binary message $m_i(t)$ determined by

$$m_i(t) = M_{i,t}(f_i, m_j(0), \ldots, m_j(t-1)).$$

Thus the message transmitted by a processor depends only on the function
$f_i$ known by it, together with all messages it has received in the past.
At time $T$ the exchange of messages ceases and processor $P_1$ picks a point
in $[0,1]^n$ according to

$$x = Q(f_1, m_2(0), \ldots, m_2(T-1)). \qquad (1.3)$$

The number $C(f_1, f_2; \epsilon, \pi)$ of messages transmitted under this protocol is
simply $2T$. We define $\Pi(\epsilon)$ as the set of all protocols with the property that
the point $x$ generated by (1.3) belongs to $I(f_1 + f_2; \epsilon)$, for every $f_1, f_2 \in \mathcal{F}$.
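
The sequence of events in a protocol can be sketched in code. This is a minimal simulation under
stated assumptions, not part of the paper: `run_protocol` and the toy message rules below are
hypothetical names, and each processor's "input" is passed as an arbitrary Python object. In the
toy example processor 1 holds the zero function and processor 2 holds $(x - a)^2$ on $[0,1]$,
represented only by its minimizer $a$.

```python
def run_protocol(f1, f2, T, message_fn_1, message_fn_2, final_fn):
    """Simulate T rounds in which each processor sends one binary message per round.

    message_fn_i(t, f_i, received) returns the bit sent by processor i at time t;
    it sees only its own input and the bits it has received so far.
    final_fn(f1, received) is evaluated by processor 1 after the exchange ends.
    Returns the chosen point and the number of messages exchanged (2T).
    """
    received_by_1 = []
    received_by_2 = []
    for t in range(T):
        m1 = message_fn_1(t, f1, tuple(received_by_1))
        m2 = message_fn_2(t, f2, tuple(received_by_2))
        if m1 not in (0, 1) or m2 not in (0, 1):
            raise ValueError("messages must be binary")
        received_by_2.append(m1)
        received_by_1.append(m2)
    x = final_fn(f1, tuple(received_by_1))
    return x, 2 * T


def silent(t, f, received):
    """Processor 1 has nothing useful to say in the toy example."""
    return 0


def send_bit(t, a, received):
    """Processor 2 sends the (t+1)-th bit of the binary expansion of a."""
    return int(a * 2 ** (t + 1)) % 2


def decode(f, received):
    """Processor 1 rebuilds a from the bits and returns the middle of the remaining cell."""
    T = len(received)
    return sum(bit / 2 ** (k + 1) for k, bit in enumerate(received)) + 1 / 2 ** (T + 1)


x, messages = run_protocol(None, 0.3, 8, silent, send_bit, decode)
```

With $T = 8$ the sketch exchanges 16 messages and returns a point within $2^{-9}$ of the minimizer,
which shows why the message count is $2T$ even when one processor's messages carry no information.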

A couple of remarks on our definition of protocols are in order.

(i) We have constrained each processor to transmit exactly one binary
message at each stage. This may be wasteful if, for example, a better
protocol may be found in which $P_1$ first sends many messages and then $P_2$
transmits its own messages. Nevertheless, the waste that results can be at
most a factor of two. Since, in this paper, we study only orders of magnitude,
this issue is unimportant.

(ii) We have assumed that the termination time $T$ is the same for all $f_1$,
$f_2 \in \mathcal{F}$, even though for certain "easy" functions the desired result may
have been obtained earlier. Again, this is of no concern because we are
interested in a worst case analysis.


Related Research


The study of communication complexity was initiated by Abelson
(1980) and Yao (1979). Abelson deals with problems of continuous variables,
in which an exact result is sought, and allows the messages to be
real-valued, subject to a constraint that they are smooth functions of the
input. This is a different type of problem from ours, because we are
interested in an approximate result and we are assuming binary messages.

Yao (1979) deals with combinatorial problems, in which messages are
binary and an exact result is obtained after finitely many stages. This
reference has been followed by a substantial amount of research which
developed the theory further and also evaluated the communication complexity
of selected combinatorial problems (Papadimitriou and Sipser,
1982; Papadimitriou and Tsitsiklis, 1982; Aho et al., 1983; Pang and El
Gamal, 1986; Mehlhorn and Schmidt, 1982; Ullman, 1984). The main
application of this research has been in VLSI, where communication
complexity constrains the amount of information that has to flow from
one side of a chip to the other; this in turn determines certain trade-offs on
the achievable performance of special-purpose VLSI chips for computing
certain functions (Ullman, 1984).

Finally, communication complexity has also been studied for models of
asynchronous distributed computation, in which messages may reach
their destination after an arbitrary delay (Awerbuch and Gallager, 1985).

The communication complexity of the approximate solution of problems
of continuous variables has not been studied before, to the best of
our knowledge. However, there exists a large amount of theory on the
information requirements for solving (approximately) certain problems
such as nonlinear optimization and numerical integration of differential
equations (Nemirovsky and Yudin, 1983; Traub and Wozniakowski, 1980)
("information based complexity"). Here one raises questions such as:
How many gradient evaluations are required for an algorithm to find a
point which minimizes a convex function within some prespecified accuracy
$\epsilon$? We can see that, in this type of research, information flows one
way, from a "memory unit" (which knows the function being minimized)
to the processor, and this is what makes it different from ours.


Outline 


In Section II we establish straightforward lower bounds such as $C(\mathcal{F}; \epsilon)
\ge O(n \log(1/\epsilon))$. In Section III we show that the naive distributed version
of ellipsoid-type algorithms leads to protocols with $O(n^2 \log(1/\epsilon)(\log n +
\log(1/\epsilon)))$ communication requirements and we show that this upper bound
cannot be improved substantially within a restricted class of protocols. In
Sections IV and V we partially close the gap between the above-mentioned
upper and lower bounds by presenting protocols with $O(\log(1/\epsilon))$
communication requirements for the case $n = 1$ (Section IV) and with
$O(n \log n(\log n + \log(1/\epsilon)))$ communication requirements for the case of
general $n$ (Section V), under certain regularity assumptions on the elements
of $\mathcal{F}$. In Section VI, we provide some discussion of possible extensions
and questions which remain open.


II. Lower Bounds on $C(\mathcal{F}; \epsilon)$

Before we prove any lower bounds we start with a fairly trivial lemma
whose proof is omitted.

Lemma 2.1. If $\mathcal{F} \subset \mathcal{G}$ then $C(\mathcal{F}; \epsilon) \le C(\mathcal{G}; \epsilon)$.

Let $\mathcal{F}_Q$ be the set of quadratic functions of the form $f(x) = \|x - x^*\|^2$,
with $x^* \in [0,1]^n$ and where $\|\cdot\|$ is the Euclidean norm. Also, let $\mathcal{F}_M$ be the
set of functions of the form $f(x) = \frac{1}{2}\max_{i=1,\ldots,n} |x_i - x_i^*|$, where $|x_i^*| \le 1$, $\forall i$.

Proposition 2.2. (i) $C(\mathcal{F}_Q; \epsilon) \ge O(n(\log n + \log(1/\epsilon)))$;
(ii) $C(\mathcal{F}_M; \epsilon) \ge O(n \log(1/\epsilon))$.

Proof. (i) Consider a protocol $\pi \in \Pi(\epsilon)$ with termination time $T$ and let
us study its operation for the special case where $f_1 = 0$. Let $S$ be the range
of the function $Q$ corresponding to that protocol (see Eq. (1.3)), when $f_1 =
0$. Given that the minimum of $f_2$ may be anywhere in $[0,1]^n$, $S$ must
contain points which come within $\epsilon^{1/2}$, in Euclidean distance, from every
point in $[0,1]^n$. Now, one needs at least $(An/\epsilon^{1/2})^{Bn}$ Euclidean balls of
radius $\epsilon^{1/2}$ to cover $[0,1]^n$, where $A$ and $B$ are absolute constants. (This
follows by simply taking into account the volume of a ball in n-dimensional
space.) Therefore, the cardinality of $S$ is at least $(An/\epsilon^{1/2})^{Bn}$. Given
that the cardinality of the range of a function is no larger than the cardinality
of its domain, it follows that the cardinality of $S$ is no larger than $2^T$.
Therefore, $T \ge O(n(\log n + \log(1/\epsilon)))$, which proves the first part.

(ii) The proof is almost identical to that of part (i) and is therefore
omitted. The only difference is that now $[0,1]^n$ is covered by balls in the
supremum norm and $O((1/\epsilon)^n)$ such balls are necessary and sufficient. ■

In the proof of Proposition 2.2 we made use of the assumption that the
final result is always obtained by processor $P_1$. Nevertheless, at the cost
of minor complications of the proof, the same lower bound may be obtained
even if we enlarge the class of allowed protocols so that the processor
who computes the final result is not prespecified.

Let $\mathcal{F}_{SC,M,L}$ ("strongly convex functions") be the set of all continuously
differentiable convex functions $f$ with the properties

$$L\|x - y\|^2 \le \langle f'(x) - f'(y), x - y \rangle \le ML\|x - y\|^2, \qquad (2.1)$$

$$\|f'(x)\| \le ML\,n^{1/2}, \quad \forall x \in [0,1]^n. \qquad (2.2)$$

(Note that (2.1) implies that $M \ge 1$.) Also, let $\mathcal{F}_L$ be the set of convex
functions which are bounded by $\frac{1}{2}$ and satisfy

$$|f(x) - f(y)| \le \tfrac{1}{2} \max_i |x_i - y_i|, \quad \forall x, y.$$

Proposition 2.3. (i) $C(\mathcal{F}_{SC,M,L}; \epsilon) \ge O(n(\log n + \log(1/\epsilon)))$;
(ii) $C(\mathcal{F}_L; \epsilon) \ge O(n \log(1/\epsilon))$.

Proof. Part (ii) follows from Proposition 2.2 and Lemma 2.1, because
$\mathcal{F}_M \subset \mathcal{F}_L$. For part (i), we note that $\mathcal{F}_Q \subset \mathcal{F}_{SC,M,2}$ and Lemma 2.1 proves
the result for $\mathcal{F}_{SC,M,2}$. The result for general $L$ follows because any $f \in
\mathcal{F}_{SC,M,L}$ can be scaled so that it belongs to $\mathcal{F}_{SC,M,2}$. ■


III. Naive Upper Bounds

We consider here a straightforward distributed version of the method of
the centers of gravity (MCG), which has been shown by Nemirovsky and
Yudin (1983) to be an optimal algorithm in the single-processor case, for
functions in $\mathcal{F}_L$, in the sense that it requires a minimal number of gradient
evaluations. This method may be viewed as a generalization of the well-known
ellipsoid algorithm for linear programming (Papadimitriou and
Steiglitz, 1982). We start by describing the uniprocessor version of this
method and then analyze the communication requirements of a distributed
implementation.

The MCG Algorithm (Nemirovsky and Yudin, 1983, p. 62)

Let $f \in \mathcal{F}_L$ be a convex function to be minimized with accuracy $\epsilon$. Let
$G_0 = [0,1]^n$ and let $x_1$ be its center of gravity. At the beginning of the kth
stage of the computation, we assume that we are given a convex set $G_{k-1}
\subset [0,1]^n$ and its center of gravity $x_k$. Let $z_k$ be a scalar and let $y_k$ be a
vector in $R^n$ with the following properties:

(i) $z_k + \langle y_k, x - x_k \rangle \le f(x)$, $\forall x \in [0,1]^n$;

(ii) $z_k \ge f(x_k) - (\epsilon/2)$.

(Note that if the term $\epsilon/2$ were absent in condition (ii), we would have $z_k =
f(x_k)$ and $y_k = f'(x_k)$, if $x_k$ is an interior point. The presence of the $\epsilon/2$ term
implies that these relations need to hold only approximately.)

Let $a_k = \min_{j \le k} z_j$ and let $G_k = \{x \in G_{k-1}: z_k + \langle y_k, x - x_k \rangle \le a_k\}$. The
algorithm terminates when the Lebesgue volume of $G_k$ becomes smaller
than $(\epsilon/2)^n$ and returns a point $x_j$ associated with the smallest value of $z_j$
encountered so far.

The following facts are quoted from Nemirovsky and Yudin (1983).

(a) The volume of $G_k$ is no larger than $\alpha^k$, where $\alpha$ is an absolute constant,
smaller than one and independent of the dimension $n$. Thus a total of
$n \log(2/\epsilon) / \log(1/\alpha) = O(n \log(1/\epsilon))$ stages are sufficient.

(b) The result $\bar{x}$ of the algorithm satisfies $f(\bar{x}) \le \inf_{x \in [0,1]^n} f(x) + \epsilon V(f)$, where
$V(f) = \sup_{x \in [0,1]^n} f(x) - \inf_{x \in [0,1]^n} f(x)$.

Note that $V(f) \le 1$, for $f = f_1 + f_2$, $f_1, f_2 \in \mathcal{F}_L$, so that the algorithm
indeed produces a result belonging to $I(f; \epsilon)$.

We now consider a distributed implementation of this algorithm. The
distributed protocol will consist of stages corresponding to the stages of
the MCG algorithm. At the beginning of the kth stage, both processors
know the current convex set $G_{k-1}$ and are therefore able to compute its
center of gravity $x_k$. Processor $P_i$ evaluates $f_i(x_k)$ and transmits the binary
representation of a message $b(i,k)$ satisfying $b(i,k) \in [f_i(x_k) - (\epsilon/4), f_i(x_k)
- (\epsilon/8)]$. Clearly, $b(i,k)$ may be chosen so that its binary representation
has at most $O(\log(1/\epsilon))$ bits. Also, each processor evaluates the gradient
$g_{i,k}$ of its function $f_i$ at $x_k$ (with components $g_{i,k,j}$, $j = 1, \ldots, n$) and
transmits the binary representation of messages $c(i,k,j)$ satisfying $|g_{i,k,j}
- c(i,k,j)| \le \epsilon/(16n)$. Clearly the $c(i,k,j)$'s may be chosen so that they
can be all transmitted using $O(n \log(n/\epsilon)) = O(n \log n + n \log(1/\epsilon))$ bits.

Next, each processor lets $z_k = b(1,k) + b(2,k)$ and lets $y_k$ be the vector
with components $c(1,k,j) + c(2,k,j)$. It then follows by some simple
algebra that $z_k$ and $y_k$ satisfy the specifications of the MCG algorithm.
Finally, each processor determines $G_k$ and its center of gravity $x_{k+1}$, and
the algorithm proceeds to its next stage.

We now combine our estimates of the number of stages of the MCG algorithm and of the communication requirements per stage to conclude the following.

Proposition 3.1. $C(F_L; \epsilon) \le O(n^2 \log(1/\epsilon)(\log n + \log(1/\epsilon)))$. In particular, the above-described distributed version of the MCG algorithm stays within this bound.
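As a quick check of where this bound comes from, the following worked product multiplies the two estimates used above. It takes the stage count of fact (a) to be $O(n \log(1/\epsilon))$ and the per-stage cost to be $O(n \log n + n \log(1/\epsilon))$ bits; both are our reading of the text, and the $O(\log(1/\epsilon))$ bits spent on the function values are absorbed in the per-stage term.

```latex
\underbrace{O\!\left(n\log\tfrac{1}{\epsilon}\right)}_{\text{stages}}
\;\times\;
\underbrace{O\!\left(n\log n + n\log\tfrac{1}{\epsilon}\right)}_{\text{bits per stage}}
\;=\;
O\!\left(n^{2}\log\tfrac{1}{\epsilon}\,
\Bigl(\log n+\log\tfrac{1}{\epsilon}\Bigr)\right).
```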

The upper bound of Proposition 3.1 is quite far from the lower bound of Proposition 2.2. We show next that within a certain class of protocols this upper bound cannot be substantially improved.

We consider protocols which consist of stages. At the $k$th stage there is a current point $x_k \in [0,1]^n$ known by both processors. Then, the processors transmit to each other approximate values of $f$ and of the gradient of $f$, all evaluated at $x_k$. Using the values of these messages, together with any past common information, they determine the next point $x_{k+1}$, according to some commonly known rule, and so on. We place one additional restriction: when a processor transmits an approximate value of $f(x_k)$ it does so by transmitting a sequence of bits of the binary representation of $f(x_k)$ starting from the most significant one and continuing with consecutive less significant bits. (So, for example, a processor is not allowed to transmit the first and the third most significant bits of $f(x_k)$, without transmitting the second most significant bit.) The same assumption is made concerning the components of the gradient of $f$. Finally, we require that the same number of bits of $f(x_k)$ and of each component of the gradient of $f$ get transmitted.

The above restrictions turn out to be quite severe.

Proposition 3.2. There exists a constant $A$ such that for any protocol $\pi \in \Pi(\epsilon)$ satisfying the above restrictions, there exist $f_1, f_2 \in F_L$ such that $C(f_1, f_2; \epsilon, \pi) \ge A n^2 \log^2(1/\epsilon)$. This is true, even if $f_1$ is restricted to be equal to the identically zero function.

Proof. Using an argument similar to Lemma 2.1, it is sufficient to prove the result under the restriction that $f_1 = 0$ and under the restriction that $f_2$ be differentiable and bounded, together with every component of its derivative, by $\epsilon^{1/2}$. Using the results of Nemirovsky and Yudin (1983), for processor $P_1$ to determine a point which is optimal within $\epsilon$, it must acquire nontrivial information on the values and the derivatives of $f_2$ for at least $A n \log(1/\epsilon^{1/2})$ different points. Note that the $O(\log(1/\epsilon^{1/2}))$ most significant bits of $f_2$ and each component of its derivative, evaluated at any point, are always zero. Thus, for processor $P_1$ to obtain nontrivial information at a certain point at least $\Omega(n \log(1/\epsilon^{1/2}))$ bits have to be transmitted. This leads to a total communication requirement of $\Omega(n^2 \log^2(1/\epsilon^{1/2})) = \Omega(n^2 \log^2(1/\epsilon))$ bits, which proves the result. ∎

If we relax the requirement that the same number of bits be transmitted for each component of the gradient, at each stage, then the same proof yields the lower bound $C(f_1, f_2; \epsilon, \pi) \ge A n \log^2(1/\epsilon)$.

IV. An Optimal Algorithm for the One-Dimensional Case

We prove here a result which closes the gap between upper and lower bounds for the one-dimensional case. The proof consists of the construction of an optimal protocol. We only present the protocol under the assumption that each $f_i$ is differentiable. The argument is the same in the nondifferentiable case, except that each $f_i'$ is to be interpreted as a subgradient.

Proposition 4.1. If $n = 1$ then $C(F_L; \epsilon) \le O(\log(1/\epsilon))$.


Proof. The protocol consists of consecutive stages. At the beginning of the $k$th stage, both processors have knowledge of four numbers, $a_k$, $b_k$, $c_k$, and $d_k$, with the following properties:

(i) The interval $[a_k, b_k]$ contains a point $x^*$ which minimizes $f_1 + f_2$.
(ii) The derivative of $f_1$ at any minimizer of $f_1 + f_2$ and the derivatives of $f_1$ and of $-f_2$ at $(a_k + b_k)/2$ belong to the interval $[c_k, d_k]$. (Note that the derivative of each $f_i$ has to be constant on the set of minimizers of $f_1 + f_2$.)

At the first stage of the algorithm we start with $a_1 = 0$, $b_1 = 1$, $c_1 = -1$, and $d_1 = 1$. At the $k$th stage, the processors do the following: processor $P_i$ transmits a message $m_{i,k} = 0$ if $(-1)^{i-1} f_i'((a_k + b_k)/2) \le (c_k + d_k)/2$; otherwise it transmits $m_{i,k} = 1$.

If $m_{1,k} = 0$ and $m_{2,k} = 1$, then $f_1'((a_k + b_k)/2) + f_2'((a_k + b_k)/2) \le 0$. We may then let $a_{k+1} = (a_k + b_k)/2$ and leave $b_k$, $c_k$, $d_k$ unchanged. Similarly, if $m_{1,k} = 1$ and $m_{2,k} = 0$, we let $b_{k+1} = (a_k + b_k)/2$ and leave $a_k$, $c_k$, $d_k$ unchanged.

We now consider the case $m_{1,k} = m_{2,k} = 1$. Let $x^*$ be a minimizer of $f_1 + f_2$ belonging to $[a_k, b_k]$. If $x^* \ge (a_k + b_k)/2$, then $f_1'(x^*) \ge f_1'((a_k + b_k)/2) \ge (c_k + d_k)/2$. If $x^* \le (a_k + b_k)/2$, then $f_1'(x^*) = -f_2'(x^*) \ge -f_2'((a_k + b_k)/2) \ge (c_k + d_k)/2$. In either case, we may let $c_{k+1} = (c_k + d_k)/2$ and leave $a_k$, $b_k$, $d_k$ unchanged. Finally, if $m_{1,k} = m_{2,k} = 0$, a similar argument shows that we may let $d_{k+1} = (c_k + d_k)/2$ and leave $a_k$, $b_k$, $c_k$ unchanged.

For each of the four cases, we see that $a_{k+1}$, $b_{k+1}$, $c_{k+1}$, $d_{k+1}$ will preserve properties (i), (ii), which were postulated earlier. Furthermore, at each stage, either $b_k - a_k$ or $d_k - c_k$ is halved. Therefore, after at most $k = 2 \log(1/\epsilon)$ stages, we reach a point where either $b_k - a_k \le \epsilon$ or $d_k - c_k \le \epsilon$. If $b_k - a_k \le \epsilon$, then there exists a minimizer which is within $\epsilon$ of $a_k$; given that the derivative of $f_1 + f_2$ is bounded by one, it follows that $f_1(a_k) + f_2(a_k)$ comes within $\epsilon$ of the optimum, as desired. Alternatively, if $d_k - c_k \le \epsilon$, then $|f_1'((a_k + b_k)/2) + f_2'((a_k + b_k)/2)| \le d_k - c_k \le \epsilon$. It follows that for any $x \in [0,1]$, we have $f_1(x) + f_2(x) \ge f_1((a_k + b_k)/2) + f_2((a_k + b_k)/2) - |x - (a_k + b_k)/2|\,\epsilon$, which shows that $(f_1 + f_2)((a_k + b_k)/2)$ comes within $\epsilon$ of the optimum. ∎
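The stage-by-stage rule in this proof can be simulated directly. The following is a minimal Python sketch, not the authors' code: the function name `one_dim_protocol`, its return values, and the choice to return the midpoint of the final interval are assumptions made for illustration, and the derivatives are assumed to be bounded by 1 in magnitude on [0, 1]. Each stage costs exactly two bits, one per processor.

```python
def one_dim_protocol(df1, df2, eps):
    """Simulate the two-processor bisection protocol for n = 1.

    df1, df2: derivatives of f1 and f2 on [0, 1] (assumed |f_i'| <= 1).
    Returns (point, (a, b), (c, d), bits_sent).
    """
    a, b = 0.0, 1.0    # bracket for a minimizer of f1 + f2
    c, d = -1.0, 1.0   # bracket for the slope f1'(x*)
    bits = 0
    while b - a > eps and d - c > eps:
        mid = (a + b) / 2
        thr = (c + d) / 2
        # Processor 1 tests f1'(mid); processor 2 tests -f2'(mid).
        m1 = 0 if df1(mid) <= thr else 1
        m2 = 0 if -df2(mid) <= thr else 1
        bits += 2  # one bit sent by each processor
        if m1 == 0 and m2 == 1:
            a = mid    # slope of f1 + f2 is negative at mid: minimizer lies to the right
        elif m1 == 1 and m2 == 0:
            b = mid    # slope of f1 + f2 is positive at mid: minimizer lies to the left
        elif m1 == 1:  # both messages are 1: raise the lower slope bound
            c = thr
        else:          # both messages are 0: lower the upper slope bound
            d = thr
    return (a + b) / 2, (a, b), (c, d), bits


# Illustrative quadratics (an assumption, not from the text); the sum is minimized at 0.55.
x, interval, slopes, bits = one_dim_protocol(lambda t: t - 0.3, lambda t: t - 0.8, 1e-3)
print(x, interval, slopes, bits)
```

Since every stage halves one of the two brackets, the loop runs for at most about $2 \log_2(1/\epsilon)$ stages, which is the count used in the proof.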


V. An Almost Optimal Protocol for Strongly Convex Problems

We consider here the class $F_{SC,M,L}$ of strongly convex functions which was defined in Section II as the set of continuously differentiable convex functions satisfying (2.1)-(2.2). In this section we show that a suitable distributed version of the gradient projection algorithm comes close to the lower bound of Proposition 2.3, within an $O(\log n)$ factor. In particular, for any fixed dimension $n$, we have a protocol whose dependence on $\epsilon$ is optimal.

In the protocol to be considered each processor computes the same sequence $\{x_k\}$ of elements of $[0,1]^n$ according to the iteration

$x_{k+1} = [x_k - \gamma s_k]^+$. (5.1)

We use the notation $[x]^+$ to denote the projection (with respect to the Euclidean metric) of a vector $x \in R^n$ onto the convex set $[0,1]^n$. Also, $\gamma$ is a positive scalar stepsize and $s_k$ is an approximation of the gradient of $f_1 + f_2$, evaluated at $x_k$. In particular, we let $g_k = f_1'(x_k) + f_2'(x_k)$ and we require that $s_k$ satisfy

$\|s_k - g_k\| \le n^{1/2} \alpha^k$, (5.2)

where $\alpha$ is some positive constant, independent of $k$, belonging to $(0,1)$. Naturally, we will have to ensure that there is enough communication so that each processor knows $s_k$ at the beginning of the $k$th stage.

We start by estimating the number of steps required by the above algorithm to come to a small neighborhood of the optimal point. The argument is very similar to the standard proof that the gradient projection algorithm has a linear rate of convergence (Nemirovsky and Yudin, 1983, pp. 258-260) except that we need to take care of the fact that we use $s_k$ instead of the exact gradient $g_k$. We denote by $x^*$ the unique vector in $[0,1]^n$ which minimizes $f_1 + f_2$ over that domain. (Uniqueness is a consequence of strict convexity, which follows from strong convexity.)

Proposition 5.1. If $f \in F_{SC,M,L}$, if $\{x_k\}$, $\{s_k\}$ satisfy (5.1)-(5.2), if the stepsize $\gamma$ is small enough, and if $\alpha$ is sufficiently close to 1, then there exist $A$, $B$, $C$, depending only on $M$, $L$, such that

(i) $f(x_k) - f(x^*) \le A n \alpha^{2k}$, (5.3)

(ii) $\|x_k - x^*\| \le B n^{1/2} \alpha^k$, (5.4)

(iii) $\|x_{k+1} - x_k\| \le C n^{1/2} \alpha^k$. (5.5)

Proof. We will prove the result with the following choices of constants: we let $\gamma = 1/(ML)$, $B = (2A/L)^{1/2}$, and $C = 2B$. The constants $A$ and $\alpha$ will be fixed later.

We state without proof the following properties of functions in $F_{SC,M,L}$ (Nemirovsky and Yudin, 1983, pp. 254-255):

(i) $\|f'(x) - f'(y)\| \le ML \|x - y\|$, (5.6)

(ii) $f(x + y) \ge f(x) + \langle f'(x), y \rangle + (L/2)\|y\|^2$, (5.7)

(iii) $f(x + y) \le f(x) + \langle f'(x), y \rangle + (LM/2)\|y\|^2$. (5.8)

We will be also using the inequality

$\langle f'(x^*), x - x^* \rangle \ge 0, \quad \forall x \in [0,1]^n$, (5.9)

which is a necessary and sufficient condition for optimality of $x^*$.

We continue with the main part of the proof, which proceeds by induction on $k$. We first show that part (i) holds for $k = 0$. Using the convexity of $f$, we have

$f(x^*) \ge f(x_0) + \langle f'(x_0), x^* - x_0 \rangle \ge f(x_0) - \|f'(x_0)\| \cdot \|x^* - x_0\|$.

Using (2.2) we see that $\|f'(x_0)\|$ is bounded by $MLn^{1/2}$; also, $\|x^* - x_0\|$ is bounded by $n^{1/2}$. It follows that $f(x_0) - f(x^*) \le MLn \le An$, as long as $A$ is chosen larger than $ML$.

Suppose now that (5.3) is valid for some nonnegative integer $k$. Using (5.7) and then (5.9) we obtain

$f(x_k) \ge f(x^*) + \langle f'(x^*), x_k - x^* \rangle + (L/2)\|x_k - x^*\|^2 \ge f(x^*) + (L/2)\|x_k - x^*\|^2$. (5.10)

We now use (5.10) and the induction hypothesis to obtain

$\|x_k - x^*\| \le [(2/L) A n \alpha^{2k}]^{1/2} = B n^{1/2} \alpha^k$. (5.11)

We have therefore shown that (5.4) is also valid for that particular $k$. We then use (5.4) and the triangle inequality to obtain


i. .  ;  -  a.  :  i. 


which proves (5.5) for that same value of $k$.

We now prove (5.3) for $k + 1$, which will complete the induction. Using the definition of the projection, $x_{k+1}$ minimizes $\|x - x_k + \gamma s_k\|^2$ over $x \in [0,1]^n$, which is equivalent to minimizing

$f(x_k) + \langle s_k, x - x_k \rangle + \frac{1}{2\gamma}\|x - x_k\|^2$ (5.13)

over all $x \in [0,1]^n$. Let us use the notation $J_k(x)$ to denote the expression (5.13) as a function of $x$. Let $z = x_k + (1/M)(x^* - x_k)$. Note that $z \in [0,1]^n$ because $x_k$, $x^*$ belong to $[0,1]^n$. Thus, by the minimizing property of $x_{k+1}$, we have

$J_k(x_{k+1}) \le J_k(z)$. (5.14)

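Now, writing $g_k = f'(x_k)$,

$$
\begin{aligned}
f(x_{k+1}) &\le f(x_k) + \langle g_k, x_{k+1} - x_k \rangle + \frac{LM}{2}\|x_{k+1} - x_k\|^2 \\
&\le f(x_k) + \langle s_k, x_{k+1} - x_k \rangle + \frac{LM}{2}\|x_{k+1} - x_k\|^2 + \|g_k - s_k\| \cdot \|x_{k+1} - x_k\| \\
&\le J_k(x_{k+1}) + (n^{1/2}\alpha^k)(2Bn^{1/2}\alpha^k) \\
&\le J_k(z) + 2Bn\alpha^{2k} \\
&= f(x_k) + \frac{1}{M}\langle s_k, x^* - x_k \rangle + \frac{LM}{2}\,\frac{1}{M^2}\|x^* - x_k\|^2 + 2Bn\alpha^{2k} \\
&\le f(x_k) + \frac{1}{M}\langle g_k, x^* - x_k \rangle + \frac{1}{M}\|s_k - g_k\| \cdot \|x^* - x_k\| + \frac{L}{2M}\|x^* - x_k\|^2 + 2Bn\alpha^{2k} \\
&\le f(x_k) + \frac{1}{M}\Bigl[\langle g_k, x^* - x_k \rangle + \frac{L}{2}\|x^* - x_k\|^2\Bigr] + 3Bn\alpha^{2k} \\
&\le \Bigl(1 - \frac{1}{M}\Bigr) f(x_k) + \frac{1}{M} f(x^*) + 3Bn\alpha^{2k}.
\end{aligned}
$$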

Here, the first inequality followed from (5.8); the second from the Schwarz inequality; the third from (5.2), (5.12), and the definition of $J_k(x_{k+1})$; the fourth from (5.14). In the equality we made use of the definition of $z$ and $J_k$, and the next step followed from the Schwarz inequality; then, we used the fact $M \ge 1$, (5.2), and (5.11); finally, the last line followed from (5.7). We therefore have, using the induction hypothesis,

$f(x_{k+1}) - f(x^*) \le \Bigl(1 - \frac{1}{M}\Bigr)(f(x_k) - f(x^*)) + 3Bn\alpha^{2k} \le \Bigl(1 - \frac{1}{M}\Bigr) A n \alpha^{2k} + 3\Bigl(\frac{2A}{L}\Bigr)^{1/2} n \alpha^{2k}$. (5.15)

The induction will be completed if the right-hand side of (5.15) is smaller than $An\alpha^{2(k+1)}$. This is accomplished by taking $\alpha \in (0,1)$ close enough to 1 so that $1 - 1/M$ is smaller than $\alpha^2$ and then choosing $A$ large enough so that the term involving $A^{1/2}$ is negligible in comparison with the first term in the right-hand side of (5.15). This concludes the proof. ∎

We now return to the distributed protocol. Since $f_1, f_2 \in F_{SC,M,L}$, it follows that $f_1 + f_2 \in F_{SC,M,2L}$. Consequently, Proposition 5.1 applies to $f_1 + f_2$ and shows that after $O(\log(1/\epsilon) + \log n)$ stages, the algorithm (5.1)-(5.2) reaches a point which is within $\epsilon$ from optimality.
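The stage count can be read off from part (i) of Proposition 5.1. The short derivation below assumes that (5.3) has the form $f(x_k) - f(x^*) \le A n \alpha^{2k}$ (our reading of the statement) and treats $A$ and $\alpha$ as constants depending only on $M$ and $L$.

```latex
A\,n\,\alpha^{2k} \le \epsilon
\quad\Longleftrightarrow\quad
k \;\ge\; \frac{\log(A n/\epsilon)}{2\log(1/\alpha)}
\;=\; \frac{\log A + \log n + \log(1/\epsilon)}{2\log(1/\alpha)},
\qquad\text{so}\qquad
k = O\!\left(\log n + \log\tfrac{1}{\epsilon}\right).
```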

We now indicate how the protocol may be implemented with $O(n \log n)$ bits being communicated at each stage. All we need to do is to make sure that the processors share enough information at each stage to be able to compute a vector $s_k$ satisfying (5.2). This is accomplished by letting each processor know a set of scalars $s_k(i,j)$, $i = 1, 2$, $j = 1, \ldots, n$, such that $|s_k(i,j) - g_k(i,j)| \le \alpha^k/2$, where $g_k(i,j)$ is the $j$th component of $f_i'(x_k)$. We first consider stage $k = 0$. Using (2.2) we see that $g_0(i,j)$ is bounded by $O(n^{1/2})$ for each $i$, $j$. Therefore, it is sufficient to transmit $O(\log n)$ bits, to specify each component with accuracy $\alpha^0 = 1$.

Suppose now that $k > 0$ and that quantities $s_{k-1}(i,j)$ with the desired properties have been shared at stage $k - 1$. We have $|g_k(i,j) - s_{k-1}(i,j)| \le |g_k(i,j) - g_{k-1}(i,j)| + |g_{k-1}(i,j) - s_{k-1}(i,j)| \le LM\|x_k - x_{k-1}\| + \alpha^{k-1}/2 \le (LMCn^{1/2} + 1/2)\alpha^{k-1}$. (Here we have made use of (5.6), our hypothesis that $s_{k-1}$ satisfies (5.2), and part (iii) of Proposition 5.1.) Let us impose the additional requirement that $s_k(i,j)$ be an integer multiple of $\alpha^k$. This requirement does not prohibit the attainment of our goal, which is to satisfy inequality (5.2). With this requirement, there are at most $2(LMCn^{1/2} + 1/2)\alpha^{-1} + 1$ possible choices for $s_k(i,j)$. Therefore, each processor $P_i$ may choose $s_k(i,j)$ as above and transmit its value to the other processor, while communicating only $O(\log n)$ bits for each component $j$, thus leading to a total of $O(n \log n)$ communications per stage. We have thus proved the following result.

Proposition 5.2. For any fixed $M$, $L$, we have $C(F_{SC,M,L}; \epsilon) \le O(n \log n\,(\log n + \log(1/\epsilon)))$.
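The iteration behind this bound can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the names `quantize` and `distributed_gradient_projection` are hypothetical, the starting point is taken to be the origin, each processor rounds its gradient components to the nearest integer multiple of $\alpha^k$ (so the per-component error is at most $\alpha^k/2$), and the quadratic test functions, the stepsize 0.25, and $\alpha = 0.9$ are chosen only for the example. The sketch does not implement the difference encoding that brings the cost down to $O(\log n)$ bits per component; it only shows that the quantized iteration still converges.

```python
def quantize(value, step):
    """Round value to the nearest integer multiple of step (error <= step / 2)."""
    return round(value / step) * step


def distributed_gradient_projection(grad1, grad2, n, gamma, alpha, stages):
    """Gradient projection on [0, 1]^n driven by quantized gradient messages.

    grad1, grad2: callables returning the gradient (a list of n floats) of f1, f2.
    """
    x = [0.0] * n  # assumed starting point
    for k in range(stages):
        step = alpha ** k
        # Each processor quantizes its own gradient components and "sends" them.
        s1 = [quantize(v, step) for v in grad1(x)]
        s2 = [quantize(v, step) for v in grad2(x)]
        s = [u + v for u, v in zip(s1, s2)]
        # Projected step: clipping each coordinate to [0, 1] is the Euclidean
        # projection onto the unit cube.
        x = [min(1.0, max(0.0, xj - gamma * sj)) for xj, sj in zip(x, s)]
    return x


# Illustrative problem: f1(x) = 0.5 * |x - p|^2 and f2(x) = 0.5 * |x - q|^2,
# whose sum is minimized at (p + q) / 2.
p = [0.2, 0.9, 0.5]
q = [0.6, 0.1, 0.7]
x = distributed_gradient_projection(
    lambda x: [xj - pj for xj, pj in zip(x, p)],
    lambda x: [xj - qj for xj, qj in zip(x, q)],
    n=3, gamma=0.25, alpha=0.9, stages=100,
)
print(x)
```

Because the quantization step shrinks geometrically, the gradient error decays at the same rate as the bound in (5.2), which is what lets the linear convergence rate survive the coarse messages.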


VI. Possible Extensions and Open Questions

1. The protocol of Section V is likely to be far from optimal concerning the dependence on the parameters $M$ and $L$. The gradient algorithm tends to be inefficient for poorly conditioned problems (large $M$), as opposed to variations of the conjugate gradient method (Nemirovsky and Yudin, 1983). It remains to be seen whether a suitable approximate version of the conjugate gradient method admits a distributed implementation with low communication requirements as a function of $M$.

2. For the class $F_L$, gradient methods do not work and the gap between the lower bound of Section II and the upper bound of Section III remains open. We believe that the factor of $n^2$ in the upper bound cannot be reduced. The reason is that any conceivable algorithm would need to consider at least $\Omega(n \log(1/\epsilon))$ points and it is hard to imagine any useful transfer of information concerning the behavior of the function in the vicinity of a point which does not require $\Omega(n)$ messages. On the other hand, it may be possible to reduce the factor $\log^2(1/\epsilon)$ to just $\log(1/\epsilon)$, although we do not know how to accomplish this. A related open problem concerns the $O(\log n)$ gap between Propositions 5.2 and 2.3, for the class $F_{SC,M,L}$.

3. Some directions along which it is likely that the results can be extended concern the case of $K > 2$ processors and the case where the constraints under which the optimization is carried out are not commonly known; for example, we may have a constraint of the form $g_1(x) + g_2(x) \le 0$, where each $g_i$ is a convex function known by processor $P_i$.